DiCoMo: An Algorithm Based Method to Estimate Digitization Costs in Digital Libraries
Internal identifier: 001316 (Main/Exploration); previous: 001315; next: 001317
Authors: Alejandro Bia [Spain]; Jaime Gómez [Spain]
Source:
- Lecture Notes in Computer Science [ 0302-9743 ] ; 2005.
French descriptors
- Pascal (Inist)
- Analyse coût, Bibliothèque électronique, Coût développement, Coût production, Développement logiciel, Economie, Estimation a priori, Génie logiciel, Image numérique, Internet, Langage HTML, Langage XML, Modélisation, Méthode raffinement, Numérisation, Processus fabrication, Reconnaissance caractère, Reconnaissance optique caractère, Retard, Réseau web, Sciences économiques, Texte, Télécopie.
- Wicri :
- topic : Génie logiciel, Numérisation, Télécopie.
English descriptors
- KwdEn :
- A priori estimation, Character recognition, Cost analysis, Delay, Development cost, Digital image, Digitizing, Economics, Economy, Electronic library, Facsimile, HTML language, Internet, Modeling, Optical character recognition, Production cost, Production process, Refinement method, Software development, Software engineering, Text, World wide web, XML language.
Abstract
Abstract: Estimating web-content production costs is very difficult: the large number of unknown factors makes exact predictions hard. However, digitization projects need a precise idea of the economic costs and times involved in the development of their contents. As happens with software development projects, incorrect estimates lead to delays and cost overruns. Based on methods used in Software Engineering for software development cost prediction, such as COCOMO [1] and Function Points [2], and using historical data gathered during five years of work at the Miguel de Cervantes Digital Library, where more than 12,000 books were digitized, we have refined an equation for digitization cost estimates named DiCoMo (Digitization Cost Model). This method can be adapted to different production processes, such as the production of digital XML or HTML texts using scanning plus OCR and human proofreading, or the production of digital facsimiles (scanning images without OCR). The a priori estimates are improved as the project evolves by means of adjustments based on real data obtained from previous stages of the production process. Each estimate is a refinement obtained as a result of the work done so far.
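The abstract does not reproduce the DiCoMo equation itself, so the following is only a minimal Python sketch of the general idea it describes: a COCOMO-style power law (effort = a * size^b) whose coefficient is re-fitted as real data arrive from completed stages. The function names, the use of page count as the size measure, the fixed exponent, and all numbers are hypothetical illustrations and are not taken from the paper.

```python
import math


def estimate_effort(size, a, b):
    """COCOMO-style power law: effort = a * size**b (units are arbitrary here)."""
    return a * size ** b


def recalibrate_a(observations, b):
    """Re-fit coefficient a from (size, actual_effort) pairs, holding b fixed.

    With b fixed, least squares in log space reduces to
    log(a) = mean(log(effort) - b * log(size)).
    """
    residuals = [math.log(effort) - b * math.log(size)
                 for size, effort in observations]
    return math.exp(sum(residuals) / len(residuals))


# Hypothetical starting coefficients and project size (assumptions, not paper values).
a_initial, b_fixed = 0.05, 1.05
planned_pages = 300
initial_estimate = estimate_effort(planned_pages, a_initial, b_fixed)

# Hypothetical (pages, hours) pairs measured on already completed books.
completed = [(100, 8.0), (250, 21.0), (400, 35.0)]
a_refined = recalibrate_a(completed, b_fixed)
refined_estimate = estimate_effort(planned_pages, a_refined, b_fixed)

print(f"a priori estimate: {initial_estimate:.1f} h")
print(f"refined estimate:  {refined_estimate:.1f} h (a = {a_refined:.4f})")
```

Re-running `recalibrate_a` after each production stage mirrors the abstract's statement that each estimate is a refinement based on the work done so far; a fuller model would presumably also re-fit the exponent and separate the per-process cost drivers.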
DOI: 10.1007/11551362_62
Links to previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 001313
- to stream Istex, to step Curation: 001235
- to stream Istex, to step Checkpoint: 000C25
- to stream Main, to step Merge: 001352
- to stream PascalFrancis, to step Corpus: 000435
- to stream PascalFrancis, to step Curation: 000352
- to stream PascalFrancis, to step Checkpoint: 000416
- to stream Main, to step Merge: 001446
- to stream Main, to step Curation: 001316
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">DiCoMo: An Algorithm Based Method to Estimate Digitization Costs in Digital Libraries</title>
<author><name sortKey="Bia, Alejandro" sort="Bia, Alejandro" uniqKey="Bia A" first="Alejandro" last="Bia">Alejandro Bia</name>
</author>
<author><name sortKey="G Mez, Jaime" sort="G Mez, Jaime" uniqKey="G Mez J" first="Jaime" last="G Mez">Jaime Gómez</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:903552E0AC429A3EDE9A54A2786AD96887283EBC</idno>
<date when="2005" year="2005">2005</date>
<idno type="doi">10.1007/11551362_62</idno>
<idno type="url">https://api.istex.fr/document/903552E0AC429A3EDE9A54A2786AD96887283EBC/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001313</idno>
<idno type="wicri:Area/Istex/Curation">001235</idno>
<idno type="wicri:Area/Istex/Checkpoint">000C25</idno>
<idno type="wicri:doubleKey">0302-9743:2005:Bia A:dicomo:an:algorithm</idno>
<idno type="wicri:Area/Main/Merge">001352</idno>
<idno type="wicri:source">INIST</idno>
<idno type="RBID">Pascal:05-0445900</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000435</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000352</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000416</idno>
<idno type="wicri:doubleKey">0302-9743:2005:Bia A:dicomo:an:algorithm</idno>
<idno type="wicri:Area/Main/Merge">001446</idno>
<idno type="wicri:Area/Main/Curation">001316</idno>
<idno type="wicri:Area/Main/Exploration">001316</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">DiCoMo: An Algorithm Based Method to Estimate Digitization Costs in Digital Libraries</title>
<author><name sortKey="Bia, Alejandro" sort="Bia, Alejandro" uniqKey="Bia A" first="Alejandro" last="Bia">Alejandro Bia</name>
<affiliation wicri:level="1"><country xml:lang="fr">Espagne</country>
<wicri:regionArea>Miguel Hernández University</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Espagne</country>
</affiliation>
</author>
<author><name sortKey="G Mez, Jaime" sort="G Mez, Jaime" uniqKey="G Mez J" first="Jaime" last="G Mez">Jaime Gómez</name>
<affiliation wicri:level="1"><country xml:lang="fr">Espagne</country>
<wicri:regionArea>University of Alicante</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Espagne</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2005</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">903552E0AC429A3EDE9A54A2786AD96887283EBC</idno>
<idno type="DOI">10.1007/11551362_62</idno>
<idno type="ChapterID">62</idno>
<idno type="ChapterID">Chap62</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>A priori estimation</term>
<term>Character recognition</term>
<term>Cost analysis</term>
<term>Delay</term>
<term>Development cost</term>
<term>Digital image</term>
<term>Digitizing</term>
<term>Economics</term>
<term>Economy</term>
<term>Electronic library</term>
<term>Facsimile</term>
<term>HTML language</term>
<term>Internet</term>
<term>Modeling</term>
<term>Optical character recognition</term>
<term>Production cost</term>
<term>Production process</term>
<term>Refinement method</term>
<term>Software development</term>
<term>Software engineering</term>
<term>Text</term>
<term>World wide web</term>
<term>XML language</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>.</term>
<term>Analyse coût</term>
<term>Bibliothèque électronique</term>
<term>Coût développement</term>
<term>Coût production</term>
<term>Développement logiciel</term>
<term>Economie</term>
<term>Estimation a priori</term>
<term>Génie logiciel</term>
<term>Image numérique</term>
<term>Internet</term>
<term>Langage HTML</term>
<term>Langage XML</term>
<term>Modélisation</term>
<term>Méthode raffinement</term>
<term>Numérisation</term>
<term>Processus fabrication</term>
<term>Reconnaissance caractère</term>
<term>Reconnaissance optique caractère</term>
<term>Retard</term>
<term>Réseau web</term>
<term>Sciences économiques</term>
<term>Texte</term>
<term>Télécopie</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Génie logiciel</term>
<term>Numérisation</term>
<term>Télécopie</term>
</keywords>
</textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: Estimating web-content production costs is very difficult: the large number of unknown factors makes exact predictions hard. However, digitization projects need a precise idea of the economic costs and times involved in the development of their contents. As happens with software development projects, incorrect estimates lead to delays and cost overruns. Based on methods used in Software Engineering for software development cost prediction, such as COCOMO [1] and Function Points [2], and using historical data gathered during five years of work at the Miguel de Cervantes Digital Library, where more than 12,000 books were digitized, we have refined an equation for digitization cost estimates named DiCoMo (Digitization Cost Model). This method can be adapted to different production processes, such as the production of digital XML or HTML texts using scanning plus OCR and human proofreading, or the production of digital facsimiles (scanning images without OCR). The a priori estimates are improved as the project evolves by means of adjustments based on real data obtained from previous stages of the production process. Each estimate is a refinement obtained as a result of the work done so far.</div>
</front>
</TEI>
<affiliations><list><country><li>Espagne</li>
</country>
</list>
<tree><country name="Espagne"><noRegion><name sortKey="Bia, Alejandro" sort="Bia, Alejandro" uniqKey="Bia A" first="Alejandro" last="Bia">Alejandro Bia</name>
</noRegion>
<name sortKey="Bia, Alejandro" sort="Bia, Alejandro" uniqKey="Bia A" first="Alejandro" last="Bia">Alejandro Bia</name>
<name sortKey="G Mez, Jaime" sort="G Mez, Jaime" uniqKey="G Mez J" first="Jaime" last="G Mez">Jaime Gómez</name>
<name sortKey="G Mez, Jaime" sort="G Mez, Jaime" uniqKey="G Mez J" first="Jaime" last="G Mez">Jaime Gómez</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001316 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001316 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:903552E0AC429A3EDE9A54A2786AD96887283EBC |texte= DiCoMo: An Algorithm Based Method to Estimate Digitization Costs in Digital Libraries }}
This area was generated with Dilib version V0.6.32.